Optimize backend docker image #1103
…ade from v0 to v2
Walkthrough
The pull request introduces updates to the workflow configuration for deploying the backend to Google Cloud Run, enabling manual execution with input parameters. It includes updates to several action versions and adds a step to delete the …
@thainguyensunya fyi ~
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (4)
.github/workflows/gcp_backend.yml (3)
Line range hint 1-41: Approve workflow trigger changes with a minor suggestion.
The transition to a manually triggered workflow with input parameters for environment and branch selection is a good improvement. It provides more control over deployments and allows for greater flexibility.
The input validation step is a nice addition to ensure only valid environments are used. Consider adding a similar validation for the branch input to ensure it exists in the repository.
Here's a suggested addition to validate the branch:
- name: Validate Branch Input
  run: |
    git fetch origin ${{ github.event.inputs.branch }}
    if ! git rev-parse --verify origin/${{ github.event.inputs.branch }} &>/dev/null; then
      echo "Invalid branch: ${{ github.event.inputs.branch }}. Branch does not exist in the repository."
      exit 1
    fi
🧰 Tools
🪛 actionlint
44-44: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 yamllint
[error] 51-51: trailing spaces
(trailing-spaces)
[warning] 79-79: too many blank lines
(1 > 0) (empty-lines)
55-67: Approve Docker build and push improvements with a suggestion.
The refactoring of the Docker build and push process using docker/build-push-action@v6 is an excellent improvement. The explicit definition of build context and Dockerfile path enhances clarity, while the addition of caching options can significantly improve build performance.
To further improve the workflow, consider adding a step to prune old images from the registry to manage storage and costs.
Here's a suggested addition after the build and push step:
- name: Prune old images
  run: |
    gcloud container images list-tags gcr.io/${{ vars.GCP_PROJECT_ID }}/${{ env.SERVICE }} \
      --format='get(digest)' --filter='NOT tags:latest' | \
      xargs -I {} gcloud container images delete gcr.io/${{ vars.GCP_PROJECT_ID }}/${{ env.SERVICE }}@{} --quiet
This step will remove all images that are not tagged as 'latest', helping to manage storage in your container registry.
51-51: Minor formatting improvements.
There are a couple of minor formatting issues in the file:
- Remove the trailing space on line 51.
- Remove the extra blank line at the end of the file (line 79).
These changes will improve the overall cleanliness of the code without affecting its functionality.
Also applies to: 79-79
backend/Dockerfile (1)
14-14: Optimize image size by cleaning up apt caches
After installing packages with apt-get, it's good practice to clean up the local repository of retrieved package files to reduce the image size. This can be done by removing the contents of /var/lib/apt/lists.
Apply this diff to clean up apt caches:
- RUN apt-get update && apt-get -y install ffmpeg curl unzip
+ RUN apt-get update && apt-get -y install ffmpeg curl unzip && rm -rf /var/lib/apt/lists/*
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- .github/workflows/gcp_backend.yml (1 hunks)
- backend/Dockerfile (1 hunks)
🧰 Additional context used
🪛 actionlint
.github/workflows/gcp_backend.yml
44-44: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 yamllint
.github/workflows/gcp_backend.yml
[error] 51-51: trailing spaces
(trailing-spaces)
[warning] 79-79: too many blank lines
(1 > 0) (empty-lines)
🔇 Additional comments (8)
.github/workflows/gcp_backend.yml (2)
46-51: Approve Google Auth action update.
Updating the Google Auth action from v0 to v2 is a good practice. This ensures you're using the latest features and security improvements provided by the action.
52-54: Approve addition of GCR login step.
The addition of the GCR login step is necessary and correctly implemented. This ensures that the workflow can push and pull Docker images from Google Container Registry.
backend/Dockerfile (6)
3-4: Correct setup of virtual environment and PATH
The creation of the virtual environment and updating the PATH environment variable are correctly implemented, ensuring that the application uses the isolated environment for dependencies.
7-7: Efficient installation of dependencies with pip
Using the --no-cache-dir and --upgrade flags optimizes the dependency installation process by preventing cache storage and ensuring the latest packages are installed.
9-9: Switching to slim image for reduced final image size
Using python:3.11-slim in the final stage is a good choice to minimize the size of the Docker image, which can lead to faster deployments and reduced resource usage.
11-12: Setting working directory and PATH in the final stage
Setting the WORKDIR to /app and updating the PATH ensures that the application runs in the correct directory with access to the virtual environment's binaries.
16-17: Copying virtual environment and application code
Copying the virtual environment and the application code into the final image is correctly handled, ensuring that all dependencies and application files are included for runtime.
20-20: Application entrypoint command remains appropriate
The CMD instruction properly starts the application using uvicorn, specifying the host and port, which aligns with the previous configuration.
.github/workflows/gcp_backend.yml
Outdated
  - name: Deploy to Cloud Run
    id: deploy
-   uses: google-github-actions/deploy-cloudrun@v0
+   uses: google-github-actions/deploy-cloudrun@v2
    with:
      service: ${{ env.SERVICE }}
      region: ${{ env.REGION }}
      image: gcr.io/${{ vars.GCP_PROJECT_ID }}/${{ env.SERVICE }}

  # If required, use the Cloud Run url output in later steps
  - name: Show Output
-   run: echo ${{ steps.deploy.outputs.url }}
\ No newline at end of file
+   run: echo ${{ steps.deploy.outputs.url }}
+
Approve Cloud Run deployment action update and suggest checkout action update.
Updating the Cloud Run deployment action from v0 to v2 is a good practice. This ensures you're using the latest features and security improvements provided by the action.
Additionally, the static analysis tool has flagged that the actions/checkout@v2 action is outdated. To address this:
Update the checkout action to the latest version:
- uses: actions/checkout@v2
+ uses: actions/checkout@v4
This will ensure you're using the most up-to-date version of the checkout action, which may include important bug fixes and performance improvements.
Committable suggestion was skipped due to low confidence.
@@ -1,16 +1,21 @@
-FROM tiangolo/uvicorn-gunicorn:python3.11
+FROM python:3.11 AS builder
🛠️ Refactor suggestion
Consider using 'python:3.11-slim' in the builder stage for consistency and reduced image size
Since you're using python:3.11-slim in the final stage, you might consider using it in the builder stage as well to maintain consistency and potentially reduce the overall image size. Ensure that all necessary build tools and libraries are available in the slim image for successful dependency installation.
Apply this diff to implement the suggestion:
-FROM python:3.11 AS builder
+FROM python:3.11-slim AS builder
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-FROM python:3.11 AS builder
+FROM python:3.11-slim AS builder
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (5)
backend/Dockerfile (4)
6-9: Good practices for dependency management, with a minor suggestion
The approach for handling dependencies is well-structured:
- Installing git for potential VCS-based dependencies.
- Copying only the requirements file for better layer caching.
- Using the --upgrade flag to ensure up-to-date packages.
Consider combining the git installation with other necessary build tools in a single RUN command to reduce image layers:
-RUN apt-get update && apt-get -y install git
+RUN apt-get update && apt-get -y install git build-essential && rm -rf /var/lib/apt/lists/*
This change would also clean up the apt cache, further reducing the image size.
16-16: Efficient package installation with a suggestion for documentation
The installation of additional packages is done efficiently in a single RUN command, which is a good practice for minimizing image layers. Cleaning up the apt cache also helps reduce the final image size.
Consider adding comments to explain why each of these packages (ffmpeg, curl, unzip) is necessary for the application. This will help future maintainers understand the dependencies and potentially optimize the image further if any of these become unnecessary.
Example:
# Install required packages:
# - ffmpeg: for audio/video processing
# - curl: for downloading resources
# - unzip: for extracting downloaded archives
RUN apt-get update && apt-get -y install ffmpeg curl unzip && rm -rf /var/lib/apt/lists/*
18-22: Efficient final setup with a minor optimization suggestion
The final steps of the Dockerfile are well-structured:
- Copying the virtual environment from the builder stage is an efficient way to transfer dependencies.
- Copying the backend code leverages Docker's layer caching.
- Exposing port 8080 and using uvicorn to run the application remain unchanged and appropriate.
Consider optimizing the COPY command for the backend code:
-COPY backend/ .
+COPY backend/ ./
This explicit use of ./ as the destination ensures that the contents of the backend directory are copied directly into the current working directory (/app), which can prevent potential issues with nested directories.
1-22: Excellent optimization of the backend Docker image
This updated Dockerfile significantly improves the build process and final image:
- Implements a multi-stage build, separating the build environment from the runtime environment.
- Uses slim images to reduce the final image size.
- Leverages virtual environments for better dependency isolation.
- Optimizes layer caching through strategic ordering of commands.
- Combines RUN commands to reduce the number of layers.
These changes will likely result in a smaller, more efficient Docker image and faster build times, aligning perfectly with the PR objective of optimizing the backend Docker image.
To further improve the CI/CD pipeline:
- Consider implementing Docker layer caching in your CI/CD process to speed up builds even more.
- Regularly update the base images and dependencies to ensure you're using the latest security patches.
- Implement a vulnerability scanning step in your pipeline to catch any security issues in the dependencies or base images.
.github/workflows/gcp_backend.yml (1)
Line range hint 31-37: Great addition of environment validation!
The new step to validate the environment input is an excellent safety measure. It prevents accidental deployments to incorrect environments, which aligns well with best practices for CI/CD pipelines.
A minor suggestion to improve readability:
Consider using a case statement for better readability and easier expansion in the future:
- if [[ "${{ github.event.inputs.environment }}" != "development" && "${{ github.event.inputs.environment }}" != "prod" ]]; then
-   echo "Invalid environment: ${{ github.event.inputs.environment }}. Must be 'development' or 'prod'."
-   exit 1
- fi
+ case "${{ github.event.inputs.environment }}" in
+   development|prod)
+     ;;
+   *)
+     echo "Invalid environment: ${{ github.event.inputs.environment }}. Must be 'development' or 'prod'."
+     exit 1
+     ;;
+ esac
Also applies to: 38-38
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- .github/workflows/gcp_backend.yml (1 hunks)
- backend/Dockerfile (1 hunks)
🧰 Additional context used
🔇 Additional comments (7)
backend/Dockerfile (2)
1-4: Excellent implementation of multi-stage build with slim images!
Great job on implementing the multi-stage build process and using slim images. This change addresses the previous suggestion and will help reduce the final image size. The creation of a virtual environment is a best practice for dependency isolation.
11-15: Well-structured final stage setup
The final stage is set up efficiently:
- Using a slim Python image reduces the final image size.
- Setting the working directory to /app is a good practice.
- Updating the PATH ensures the virtual environment is correctly used.
These changes contribute to a more optimized and maintainable Docker image.
.github/workflows/gcp_backend.yml (5)
Line range hint 3-17: Excellent update to workflow trigger and inputs!
The change to workflow_dispatch with custom inputs for environment and branch selection is a great improvement. This allows for:
- More controlled deployments through manual triggering.
- Flexibility in choosing the deployment environment and branch.
- Better alignment with different stages of your development lifecycle.
These changes contribute to the overall optimization of the backend deployment process, as mentioned in the PR objectives.
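For readers who have not seen the workflow file, here is a minimal sketch of what a manual trigger with these two inputs might look like. The input names `environment` and `branch` and the allowed values `development` and `prod` come from the review comments; the workflow name, descriptions, defaults, job name, and runner are illustrative assumptions, not the PR's actual file.

```yaml
# Hypothetical sketch: name, descriptions, defaults, and job layout are assumptions.
name: Deploy Backend (sketch)

on:
  workflow_dispatch:
    inputs:
      environment:
        description: "Target environment (development or prod)"
        required: true
        default: development
      branch:
        description: "Branch to deploy"
        required: true
        default: main

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      # Reject anything other than the two known environments before doing any work.
      - name: Validate environment input
        env:
          TARGET_ENV: ${{ github.event.inputs.environment }}
        run: |
          case "$TARGET_ENV" in
            development|prod) ;;
            *)
              echo "Invalid environment: $TARGET_ENV. Must be 'development' or 'prod'."
              exit 1
              ;;
          esac

      # Check out the branch chosen when the workflow was started.
      - name: Check out the selected branch
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.branch }}
```

Passing the input through an `env` variable rather than interpolating it directly into the script is a design choice of this sketch: it keeps user-supplied text out of the shell source.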
39-41: Excellent updates to action versions and new steps!
- Updating the checkout action to v4 and Google Auth action to v2 is great for leveraging the latest features and security improvements.
- The new step to delete the /opt/hostedtoolcache directory is a smart workaround for potential storage issues on GitHub-hosted runners.
- Adding a separate step for logging into Google Container Registry (GCR) improves clarity and maintainability.
These changes align well with the PR objective of optimizing the backend deployment process and addressing deprecated errors.
The previous review comment about updating the checkout action has been addressed in this PR.
Also applies to: 42-42, 44-44, 48-48, 52-54
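As a rough illustration, the cleanup, auth, and login steps described above might be arranged as follows (the checkout step is omitted). The secret name `GCP_CREDENTIALS`, the step names, and the use of `gcloud auth configure-docker` for the GCR login are assumptions; the thread does not show the PR's actual steps.

```yaml
# Hypothetical sketch: step names, the secret name, and the login command are assumptions.
steps:
  # Free disk space on the GitHub-hosted runner before the Docker build.
  - name: Delete hosted tool cache
    run: rm -rf /opt/hostedtoolcache

  # Authenticate to Google Cloud with a service account key stored as a secret.
  - name: Google Auth
    uses: google-github-actions/auth@v2
    with:
      credentials_json: ${{ secrets.GCP_CREDENTIALS }}

  # Register gcloud as a Docker credential helper so pushes to gcr.io are authorized.
  - name: Login to GCR
    run: gcloud auth configure-docker --quiet
```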
55-67: Excellent optimization of Docker build process!
The new Docker build and push process is a significant improvement:
- Using docker/setup-buildx-action@v3 enables advanced features like multi-platform builds and improved caching.
- The docker/build-push-action@v6 with registry caching options is a key factor in achieving the 400% faster build time mentioned in the PR objectives.
- The cache-from and cache-to configurations ensure that subsequent builds can reuse layers, further optimizing the process.
These changes directly address the main goal of this PR to optimize the backend Docker image and significantly reduce build times.
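A minimal sketch of the Buildx and registry-cache setup described above, assuming the build context is the repository root and the cache is stored under a separate `buildcache` tag. Both of those, along with the step names, are assumptions; the image reference reuses the `gcr.io/${{ vars.GCP_PROJECT_ID }}/${{ env.SERVICE }}` form quoted elsewhere in this review.

```yaml
# Hypothetical sketch: context, Dockerfile path, and the "buildcache" tag are assumptions.
steps:
  - name: Set up Docker Buildx
    uses: docker/setup-buildx-action@v3

  - name: Build and push backend image
    uses: docker/build-push-action@v6
    with:
      context: .
      file: backend/Dockerfile
      push: true
      tags: gcr.io/${{ vars.GCP_PROJECT_ID }}/${{ env.SERVICE }}
      # Pull previously exported layers from the registry, then export this build's layers.
      cache-from: type=registry,ref=gcr.io/${{ vars.GCP_PROJECT_ID }}/${{ env.SERVICE }}:buildcache
      cache-to: type=registry,ref=gcr.io/${{ vars.GCP_PROJECT_ID }}/${{ env.SERVICE }}:buildcache,mode=max
```

With `mode=max`, layers from intermediate stages (such as a builder stage) are exported as well, which is usually what makes a multi-stage build benefit from the cache.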
70-70: Approved: Cloud Run deployment action update
Updating the Google Cloud Run deployment action from v0 to v2 is an excellent change. This ensures you're using the latest features and security improvements provided by the action. It also addresses the PR objective of resolving deprecated errors.
Line range hint 1-79: Summary: Excellent optimization and modernization of the backend deployment workflow
This PR successfully achieves its objectives:
- Significantly reduces build time by implementing Docker Buildx and registry caching.
- Addresses deprecated errors by updating action versions.
- Improves workflow flexibility with manual triggers and environment selection.
- Enhances security and reliability with environment validation and up-to-date actions.
These changes collectively result in a more efficient, secure, and maintainable deployment process for the backend service. Great work on this optimization!
bfdc8c4 to 764c73a
lgtm 🥳
Issue: #1064
Key changes:
-> Reduce build time from 1x minutes to ~3 minutes: 400% faster build
before
after
Summary by CodeRabbit
New Features
Bug Fixes
Refactor